Refactor `json_schema.py`, implement JSON Schema to YAML #1182
The JSON schema code is now in `outlines-core`. Unless there was an issue opened for that to be done separately, it looks like we missed it in #1175.
These changes will need to be moved and, potentially, ported to Rust. At the very least, we can't have two versions of the same basic JSON schema logic.
Seems there was a miscommunication. Thanks for clarifying, @brandonwillard. I'll get started on porting the changes to Rust.
Overview
Refactor `json_schema.py` to be more coherent and extensible. Use extensibility to implement JSON Schema to YAML.
Changes
- Refactor `to_regex` into a class `JSONSchemaRegexGenerator` with visitors which implement JSON Schema rules, and formatters which implement pattern construction.
- Implement `YAMLRegexGenerator` by subclassing `JSONSchemaRegexGenerator` and overriding some formatters.
Tests:
- `test_json_schema.py`: its existing tests now also apply to YAML.
- Cases for `anyOf` and `allOf`.
- `test_generate.py::test_generate_json`: test both json and yaml modes.
Behavioral Changes
The only behavior changes are:
- `NotImplementedError` is raised for schema components that are not implemented.
- `anyOf`, `allOf`, `oneOf`:
  - `anyOf`: Previously broken, now ORs sub-patterns.
  - `allOf`: Previously broken, now ANDs sub-patterns via positive lookahead.
  - `oneOf`: Warns the user that it is using `anyOf` instead, and calls `anyOf`.
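The `anyOf` and `allOf` behavior described above can be illustrated with plain regular expressions. This is a minimal sketch, not the PR's implementation: the helper names `any_of` and `all_of` are hypothetical, and it assumes a non-empty list of sub-patterns that are each meant to match the whole value.

```python
import re


def any_of(patterns):
    """OR: the value may match any one of the sub-patterns (alternation)."""
    return "(?:" + "|".join(f"(?:{p})" for p in patterns) + ")"


def all_of(patterns):
    """AND: the value must match every sub-pattern.

    All but the last sub-pattern are asserted with a positive lookahead
    anchored at the end of the input (\\Z); the last one consumes the text.
    """
    *asserted, consumed = patterns
    lookaheads = "".join(rf"(?=(?:{p})\Z)" for p in asserted)
    return f"{lookaheads}(?:{consumed})"


# Hypothetical usage: digits OR a boolean literal; digits AND exactly 3 chars.
digits_or_bool = any_of([r"[0-9]+", r"true|false"])
three_digits = all_of([r"[0-9]+", r".{3}"])
```

Anchoring the lookahead with `\Z` only works when the pattern covers the entire input, as with `re.fullmatch`. A pattern embedded inside a larger object pattern would need a different way to delimit the value.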
The rules are much closer to the JSON Schema spec than on `main`; however, strict adherence to the JSON Schema spec isn't always desirable. Users can enable the JSON Schema compliant validation rules via `strict_json_schema_subset=False`, resulting in:
- `items`: If unspecified, allow additional items without constraints.
- `properties`: If unspecified, allow additional properties without constraints.
json-schema.org test suite
This is a large change-set. To verify correctness, in addition to ensuring current tests pass, `test_json_schema_full.py` tests compliance with JSON Schema by retrieving 1,245 test cases from the official json-schema.org test suite.
- `main`
- `NotImplementedError` (acceptable: visible)
Raising `NotImplementedError` makes it clear to the user why a schema would fail during generation, and it does so before generation.
test_json_schema_to_yaml_compliance
For each of the 263 tests which pass in `test_json_schema_to_json_compliance`, we verify that the corresponding yaml pattern is also correct.
TODO
- Refactor `json_schema` so it's clean and extensible
- Extend `test_json_schema_full.py` to yaml
- Update docs to reflect new behaviour surrounding the JSON Schema spec-compliant implementation
- `strict_json_schema_subset`
Further Work
- `json_schema.py` does too much. This new structure makes separation of concerns clear, easing a refactor.
- `JSONSchemaRegexGenerator.to_automata(...)`: Not using a pattern intermediate would simplify things.
- Implement `NotImplemented` components based on users opening issues.
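The separation of concerns mentioned above (visitors interpret schema keywords, formatters build patterns, and a YAML generator overrides only formatters) might look roughly like the following. This is a heavily simplified sketch under stated assumptions: the class and method names are hypothetical stand-ins rather than the PR's actual API, only three types are handled, and the YAML output is restricted to non-empty block sequences of scalars.

```python
import json
import re


class SketchJSONRegexGenerator:
    """Hypothetical stand-in for a visitor/formatter style generator."""

    # Formatters: decide how a value is written as a pattern.
    def format_boolean(self):
        return r"(?:true|false)"

    def format_integer(self):
        return r"-?(?:0|[1-9][0-9]*)"

    def format_array(self, item_pattern):
        # Zero or more items separated by ", " (matches json.dumps defaults).
        return rf"\[(?:{item_pattern}(?:, {item_pattern})*)?\]"

    # Visitors: interpret schema keywords and delegate to formatters.
    def visit(self, schema):
        visitor = getattr(self, f"visit_{schema.get('type')}", None)
        if visitor is None:
            # Fail before generation, with a visible reason.
            raise NotImplementedError(f"unsupported schema: {schema!r}")
        return visitor(schema)

    def visit_boolean(self, schema):
        return self.format_boolean()

    def visit_integer(self, schema):
        return self.format_integer()

    def visit_array(self, schema):
        return self.format_array(self.visit(schema["items"]))


class SketchYAMLRegexGenerator(SketchJSONRegexGenerator):
    """Reuses every visitor; overrides only the array formatter."""

    def format_array(self, item_pattern):
        # Non-empty block sequence, one "- item" per line (scalars only).
        return rf"(?:- {item_pattern}\n)+"


schema = {"type": "array", "items": {"type": "integer"}}
json_pattern = SketchJSONRegexGenerator().visit(schema)
yaml_pattern = SketchYAMLRegexGenerator().visit(schema)
```

Because the schema rules live in the visitors, the YAML subclass inherits them unchanged and only the output syntax differs.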